Reduce number of zio free threads #534

nedbass · 2012-01-13T22:28:29Z

As described in Issue #458, unlinking large amounts of data can cause
the threads in the zio free wait queue to start spinning. Reducing
the number of z_fr_iss threads from a fixed value of 100 to one per CPU
significantly reduces contention on the taskq spinlock and improves
throughput.
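
To make the sizing change concrete, here is a minimal user-space sketch of the two policies: a fixed thread count versus one thread per online CPU. The names `thread_policy_t` and `free_thread_count()` are hypothetical and are not the identifiers used in the ZFS source. The CPU count is passed in as a parameter so the logic can be checked outside the kernel.

```c
#include <assert.h>

/* Hypothetical sizing policies for the z_fr_iss taskq (illustration only). */
typedef enum {
    POLICY_FIXED,    /* always use the given value, e.g. 100 */
    POLICY_PER_CPU   /* use `value` threads per online CPU, e.g. 1 */
} thread_policy_t;

/*
 * Return the number of worker threads for a policy.
 * `value` is the fixed count or the per-CPU multiplier; `ncpus` is the
 * number of online CPUs. The result is never less than 1.
 */
static int
free_thread_count(thread_policy_t policy, int value, int ncpus)
{
    int n;

    assert(value > 0);
    n = (policy == POLICY_FIXED) ? value : value * ncpus;
    return (n < 1) ? 1 : n;
}
```

Under these assumptions, an 8-CPU machine gets 8 workers from `free_thread_count(POLICY_PER_CPU, 1, 8)` instead of 100, so far fewer threads can be parked on the wait queue at once.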

Instrumenting the taskq code showed that __taskq_dispatch() can spend
a long time holding tq->tq_lock if there are a large number of threads
in the queue. It turns out the time spent in wake_up() scales
linearly with the number of threads in the queue. When a large number
of short work items are dispatched, as seems to be the case with
unlink, the worker threads drain the queue faster than the dispatcher
can fill it. They then all pile into the work wait queue to wait for
new work items. So if 100 threads are in the queue, wake_up() takes
about 100 times as long, and the woken threads have to spin until the
dispatcher releases the lock.
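
The scaling argument above can be illustrated with a toy cost model. This is not the taskq or scheduler code: `dispatch_hold_ns()` is a hypothetical helper, and `WAKE_COST_NS` and `ENQUEUE_COST_NS` are made-up unit costs rather than measurements. The sketch only encodes the observation that the time spent under tq->tq_lock grows linearly with the number of threads waiting in the queue.

```c
#include <assert.h>

/* Assumed per-waiter wake-up cost in nanoseconds (made-up value). */
#define WAKE_COST_NS    500L
/* Assumed fixed cost of queueing one work item (made-up value). */
#define ENQUEUE_COST_NS 200L

/*
 * Toy model of how long a dispatcher holds the queue lock when
 * `nwaiters` idle worker threads are parked on the wait queue and the
 * wake-up cost is linear in the number of waiters.
 */
static long
dispatch_hold_ns(int nwaiters)
{
    assert(nwaiters >= 0);
    return ENQUEUE_COST_NS + WAKE_COST_NS * (long)nwaiters;
}
```

In this model the wake-up portion of the hold time with 100 waiters is exactly 100 times the portion with one waiter, and every woken thread spins on the lock for that whole interval.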

Reducing the number of threads helps with the symptoms, but doesn't
get to the root of the problem. It would seem that wake_up()
shouldn't scale linearly in time with queue depth, particularly if we
are only trying to wake up one thread. In that vein, I tried making
all of the waiting processes exclusive to prevent the scheduler from
iterating over the entire list, but I still saw the linear time
scaling. So further investigation is needed, but in the meantime
reducing the thread count is an easy workaround.


behlendorf · 2012-01-17T18:33:27Z

Merged as commit 08d08eb. We'll have to see whether we can improve things further down in the taskq implementation.

This implementation of rw_tryupgrade() behaves slightly differently from its counterparts on other platforms. It drops the RW_READER lock and then acquires the RW_WRITER lock leaving a small window where no lock is held. On other platforms the lock is never released during the upgrade process. This is necessary under Linux because the kernel does not provide an upgrade function.

There are currently no callers in the ZFS code where this change in behavior is a problem. In fact, in most cases the code is already written such that if the upgrade fails the RW_READER lock is dropped and the caller blocks waiting to acquire the lock as RW_WRITER.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Matthew Thode <[email protected]>
Closes openzfs#4388
Closes openzfs#534

Mismerge introduced by openzfs#534 DLPX-82101

behlendorf closed this Jan 17, 2012

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023

DLPX-82101 Parallel object delete to accommodate GCP (openzfs#534)

8c0e559

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023

fix build (use anyhow)

0c5ae07

